Adaptive linear quadratic control using policy iteration - American Control Conference, 1994

Author

  • B. Erik Ydstie
Abstract

In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. This is the first convergence result for DP-based reinforcement learning algorithms for a continuous problem.
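The idea in the abstract can be illustrated with a minimal sketch of Q-function-based policy iteration for a discrete-time LQR problem. This is not the paper's exact algorithm: it assumes dynamics x' = A x + B u, a stage cost x'Ex + u'Fu, a discount factor gamma, and a feedback law u = K x, and it swaps in batch least squares where a recursive estimator could be used. The function names (`quad_features`, `theta_to_H`, `q_policy_iteration`) and all parameters are hypothetical. The exploration noise added to the control plays the role of the persistent-excitation condition, and `K0` is assumed to be stabilizing.

```python
import numpy as np

def quad_features(z):
    """Upper-triangular entries of z z^T, with off-diagonals doubled,
    so that theta . quad_features(z) == z^T H z for symmetric H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)
    return np.outer(z, z)[i, j] * scale

def theta_to_H(theta, n):
    """Rebuild the symmetric n x n matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H

def q_policy_iteration(A, B, E, F, K0, gamma=0.95, n_iters=10,
                       n_samples=200, noise=1.0, seed=0):
    """Sketch: estimate the quadratic Q-function of the current policy from
    data, then improve the policy greedily. A and B are used only to
    simulate the system; the learner sees just states, controls, and costs."""
    rng = np.random.default_rng(seed)
    nx, nu = B.shape
    K = K0.copy()
    for _ in range(n_iters):
        rows, costs = [], []
        x = rng.standard_normal(nx)
        for _ in range(n_samples):
            # Exploration noise keeps the regressors excited.
            u = K @ x + noise * rng.standard_normal(nu)
            x_next = A @ x + B @ u
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, K @ x_next])
            # Bellman equation: Q(x, u) - gamma * Q(x', K x') = cost(x, u)
            rows.append(quad_features(z) - gamma * quad_features(z_next))
            costs.append(x @ E @ x + u @ F @ u)
            x = x_next
        theta, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
        H = theta_to_H(theta, nx + nu)
        # Policy improvement: minimize the quadratic Q over u.
        K = -np.linalg.solve(H[nx:, nx:], H[nx:, :nx])
    return K
```

Under these assumptions the learned gain can be checked against the gain obtained by iterating the discounted Riccati equation on the known model, which is a convenient sanity test on small examples.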


Similar resources

Adaptive Linear Quadratic Control Using Policy Iteration

In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. The performance of ...


Optimization of Markov jump linear system with controlled jump probabilities of modes

The optimal control problem of Markov jump linear quadratic model with controlled jump probabilities of modes is investigated. Two kinds of mode control policies, open-loop control policy and closed-loop control policy, are considered. By using policy iteration and performance potential concept, a sufficient condition for the optimal closed-loop control policy being better than the optimal o...


Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics

In this paper, the optimal adaptive leader-follower consensus of linear continuous time multi-agent systems is considered. The error dynamics of each player depends on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...


Greedy Adaptive Critics for LQR Problems: Convergence Proofs

A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situa...


Optimal Adaptive Control for a Class of Stochastic Systems - Proceedings of the 1995 American Control Conference

We study linear-quadratic adaptive tracking problems for a special class of stochastic systems expressed in the state-space form. This is a longstanding problem in the control of aircraft flying through atmospheric turbulence. Using an ELS-based algorithm and introducing dither in the control law we show that the resulting control achieves optimal cost in the limit, while simultaneously the unk...



Publication date: 2004